Scalable Sequential Spectral Clustering

نویسندگان

  • Yeqing Li
  • Junzhou Huang
  • Wei Liu
چکیده

In the past decades, Spectral Clustering (SC) has become one of the most effective clustering approaches. Although it has been widely used, one significant drawback of SC is its expensive computation cost. Many efforts have been devoted to accelerating SC algorithms and promising results have been achieved. However, most of the existing algorithms rely on the assumption that data can be stored in the computer memory. When data cannot fit in the memory, these algorithms will suffer severe performance degradations. In order to overcome this issue, we propose a novel sequential SC algorithm for tackling large-scale clustering with limited computational resources, e.g., memory. We begin with investigating an effective way of approximating the graph affinity matrix via leveraging a bipartite graph. Then we choose a smart graph construction and optimization strategy to avoid random access to data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments carried out on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than the state-of-thearts.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Spectral Clustering of Data Using Sequential Matrix Compression

Spectral clustering has attracted much research interest in recent years since it can yield impressively good clustering results. Traditional spectral clustering algorithms first solve an eigenvalue decomposition problem to get the low-dimensional embedding of the data points, and then apply some heuristic methods such as k-means to get the desired clusters. However, eigenvalue decomposition is...

متن کامل

Scalable Spectral Clustering with Weighted PageRank

In this paper, we propose an accelerated spectral clustering method, using a landmark selection strategy. According to the weighted PageRank algorithm, the most important nodes of the data affinity graph are selected as landmarks. The selected landmarks are provided to a landmark spectral clustering technique to achieve scalable and accurate clustering. In our experiments with two benchmark fac...

متن کامل

Landmark selection for spectral clustering based on Weighted PageRank

Spectral clustering methods have various real-world applications, such as face recognition, community detection, protein sequences clustering etc. Although spectral clustering methods can detect arbitrary shaped clusters, resulting thus in high clustering accuracy, the heavy computational cost limits their scalability. In this paper, we propose an accelerated spectral clustering method based on...

متن کامل

Towards Scalable Spectral Clustering via Spectrum-Preserving Sparsification

The eigendeomposition of nearest-neighbor (NN) graph Laplacian matrices is the main computational bottleneck in spectral clustering. In this work, we introduce a highly-scalable, spectrum-preserving graph sparsification algorithm that enables to build ultra-sparse NN (u-NN) graphs with guaranteed preservation of the original graph spectrums, such as the first few eigenvectors of the original gr...

متن کامل

A Novel Clustering Algorithm Based on Bayesian Sequential Partition and Its Application in Image Segmentation

In this work, we propose a novel clustering algorithm based on Bayesian Sequential Partition (BSP) and the spectral clustering algorithm. Since the BSP is capable of providing much more accurate density estimates when the sample space is of moderate to high dimension, the proposed clustering algorithm is believed to be superior to available ones when dealing with high dimensional problems. To d...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016